Fast Offline Policy Optimization for Large Scale Recommendation
Abstract
Personalised interactive systems such as recommender systems require selecting relevant items from massive catalogues dependent on context. Reward-driven offline optimisation of these systems can be achieved by a relaxation of the discrete problem, resulting in policy learning or REINFORCE-style learning algorithms. Unfortunately, this relaxation step requires computing a sum over the entire catalogue, making the complexity of evaluating the gradient (and hence of each stochastic gradient descent iteration) linear in the catalogue size. This calculation is untenable in many real-world examples such as large-catalogue recommender systems, severely limiting the usefulness of this method in practice. In this paper, we derive an approximation of these policy learning algorithms that scales logarithmically with the catalogue size. Our contribution is based upon combining three novel ideas: a new Monte Carlo estimate of the gradient of a policy, the self-normalised importance sampling estimator, and the use of fast maximum inner product search at training time. Extensive experiments show that our algorithm is an order of magnitude faster than naive approaches yet produces equally good policies.
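To make the bottleneck concrete: for a softmax policy with item scores f(a, x), the gradient of log π(a | x) equals ∇f(a, x) minus the expectation of ∇f(a', x) over a' drawn from π, and that expectation is the sum over the whole catalogue. The sketch below is not the authors' algorithm; it only illustrates, under assumptions, how a self-normalised importance sampling (SNIS) estimate can replace such a sum. The function names, the mixture proposal (uniform plus uniform over the top-scoring items), and all parameters are hypothetical, and the brute-force top-k lookup merely stands in for a maximum inner product search index, so this toy version does not itself achieve sublinear cost.

```python
import numpy as np


def exact_expectation(scores, values):
    """E_{a ~ softmax(scores)}[values[a]] using a full pass over the catalogue."""
    p = np.exp(scores - scores.max())  # subtract the max for numerical stability
    p /= p.sum()
    return p @ values


def snis_expectation(scores, values, top_k, n_samples, eps, rng):
    """SNIS estimate of the same expectation.

    Assumed proposal: q = eps * Uniform(catalogue) + (1 - eps) * Uniform(top_k items).
    """
    n_items = scores.shape[0]
    # Brute-force top-k: a stand-in for an approximate MIPS index (assumption).
    top = np.argpartition(-scores, top_k - 1)[:top_k]

    # Draw each sample from the top-k component or from the uniform component.
    from_top = rng.random(n_samples) > eps
    samples = np.where(
        from_top,
        rng.choice(top, n_samples),
        rng.integers(0, n_items, n_samples),
    )

    # Proposal probability of each sampled item under the mixture.
    in_top = np.isin(samples, top)
    q = eps / n_items + (1.0 - eps) * in_top / top_k

    # Unnormalised weights exp(score) / q, then self-normalise so the
    # softmax partition function is never computed.
    log_w = scores[samples] - np.log(q)
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    return w @ values[samples]


rng = np.random.default_rng(0)
scores = rng.normal(size=5000)   # toy scores f(a, x) for a single context
values = scores.copy()           # quantity to average; here the score itself
exact = exact_expectation(scores, values)
approx = snis_expectation(scores, values, top_k=100, n_samples=20000, eps=0.5, rng=rng)
print(exact, approx)
```

Only the scores of the sampled items enter the estimate, which is why pairing this estimator with a fast top-k retrieval structure can avoid touching every item at each gradient step.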
Similar resources
Cross-Domain Recommendation for Large-Scale Data
Cross-domain algorithms have been introduced to help improve recommendations and to alleviate the cold-start problem, especially in small and sparse datasets. These algorithms work by transferring information from source domain(s) to a target domain. In this paper, we study whether such algorithms can be helpful for large-scale datasets. We introduce a large-scale cross-domain recommender algorithm deri...
Cascading Bandits for Large-Scale Recommendation Problems
Most recommender systems recommend a list of items. The user examines the list, from the first item to the last, and often chooses the first attractive item and does not examine the rest. This type of user behavior can be modeled by the cascade model. In this work, we study cascading bandits, an online learning variant of the cascade model where the goal is to recommend K most attractive items ...
Fast Large-Scale Spectral Clustering by Sequential Shrinkage Optimization
In many applications, we need to cluster large-scale data objects. However, some recently proposed clustering algorithms such as spectral clustering can hardly handle large-scale applications because of their complexity, although their effectiveness has been demonstrated in much previous work. In this paper, we propose a fast solver for spectral clustering. In contrast to traditional spectral cl...
An Efficient Parameter-Free Method for Large Scale Offline Learning
With the rapid growth of computer storage capacities, available data and demand for scoring models both follow an increasing trend, sharper than that of processing power. However, the main limitation to the wide spread of data mining solutions is the non-increasing availability of skilled data analysts, who play a key role in data preparation and model selection. In this paper we present a ...
A limited memory adaptive trust-region approach for large-scale unconstrained optimization
This study is concerned with a trust-region-based method for solving unconstrained optimization problems. The approach takes advantage of the compact limited memory BFGS updating formula together with an appropriate adaptive radius strategy. In our approach, the adaptive technique leads us to decrease the number of subproblems solved, while utilizing the structure of limited memory quasi-Newt...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: ['2159-5399', '2374-3468']
DOI: https://doi.org/10.1609/aaai.v37i8.26158